Unsolved Questions (UQ) Project

About Unsolved Questions

Benchmarks shape progress in AI research. A useful benchmark should be both difficult and realistic: questions should challenge frontier models while also reflecting real-world usage. Yet current paradigms face a difficulty–realism tension: exam-style benchmarks are often made artificially difficult and have limited real-world value, while benchmarks based on real user interaction often skew toward easy, high-frequency problems.

This work explores a radically different paradigm: assessing models on unsolved questions. Rather than a static benchmark scored once, we curate unsolved questions and evaluate models asynchronously over time with validator-assisted screening and community verification. We introduce UQ, a testbed of 500 challenging, diverse questions sourced from Stack Exchange, spanning topics from CS theory and math to less explored areas like sci-fi and history, and probing capabilities including reasoning, factuality, and browsing. UQ is difficult and realistic by construction: unsolved questions are often hard and arise naturally when humans seek answers, so solving them yields direct real-world value.

All: 500 questions

Technology: 52 questions

Culture & Recreation: 16 questions

Life & Arts: 35 questions

Science: 395 questions

Model Performance Leaderboard

Models are ranked by the number of their answers that pass human verification.

Rank	System	Organization	UQ-Validator Pass Rate	All Questions	Technology	Culture & Recreation	Life & Arts	Science
#1	o3 Pro	OpenAI	75 / 500 (15.0%)	4 / 500*	0 / 52	0 / 16	0 / 35	4 / 395
#2	Gemini 2.5 Pro	Google	25 / 500 (5.0%)	3 / 500*	0 / 52	0 / 16	0 / 35	3 / 395
#3	o4 mini	OpenAI	25 / 500 (5.0%)	2 / 500*	0 / 52	0 / 16	0 / 35	2 / 395
#4	o3	OpenAI	44 / 500 (8.8%)	1 / 500*	1 / 52	0 / 16	0 / 35	0 / 395
#5	DeepSeek R1	DeepSeek	11 / 500 (2.2%)	1 / 500*	0 / 52	0 / 16	0 / 35	1 / 395
#6	GPT-5	OpenAI	88 / 500 (17.6%)	0 / 500*	0 / 52	0 / 16	0 / 35	0 / 395
#7	Claude Opus 4	Anthropic	7 / 500 (1.4%)	0 / 500*	0 / 52	0 / 16	0 / 35	0 / 395
#8	Claude 3.7 Sonnet	Anthropic	6 / 500 (1.2%)	0 / 500*	0 / 52	0 / 16	0 / 35	0 / 395
#9	K2-Think	MBZUAI-IFM	0 / 498 (0.0%)	0 / 500*	0 / 52	0 / 16	0 / 35	0 / 395
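
As a minimal sketch of how the columns above relate, the snippet below recomputes the UQ-Validator pass rate from the raw counts and reproduces the ranking. The primary sort key (human-verified answers) is stated on this page; the tie-break by validator passes is only an assumption inferred from the order of the rows, and the names `ROWS`, `pass_rate`, and `rank` are hypothetical.

```python
# Rows copied from the leaderboard above:
# (system, validator passes, validator denominator, human-verified answers)
ROWS = [
    ("o3 Pro", 75, 500, 4),
    ("Gemini 2.5 Pro", 25, 500, 3),
    ("o4 mini", 25, 500, 2),
    ("o3", 44, 500, 1),
    ("DeepSeek R1", 11, 500, 1),
    ("GPT-5", 88, 500, 0),
    ("Claude Opus 4", 7, 500, 0),
    ("Claude 3.7 Sonnet", 6, 500, 0),
    ("K2-Think", 0, 498, 0),
]


def pass_rate(passes: int, denominator: int) -> float:
    """Validator pass rate as a percentage."""
    return 100.0 * passes / denominator


def rank(rows):
    """Sort by human-verified answers, then by validator passes.

    The second key is an ASSUMPTION inferred from the table order;
    the page only states the first criterion.
    """
    return sorted(rows, key=lambda r: (-r[3], -r[1]))


for position, (system, passes, denom, verified) in enumerate(rank(ROWS), start=1):
    print(f"#{position} {system}: {verified} verified, "
          f"{passes}/{denom} ({pass_rate(passes, denom):.1f}%) validator pass rate")
```

Note that a high validator pass rate does not imply verified answers: GPT-5 has the highest pass rate in the table but no human-verified answers.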

Total Questions

500

Stack Exchange URL Mirroring

Access UQ questions directly using Stack Exchange URLs

Found an interesting unsolved Stack Exchange question? You can check whether it's in the UQ dataset by modifying the URL:

Original Stack Exchange URL:

https://math.stackexchange.com/questions/358423

becomes

UQ Mirrored URL:

https://uq.stanford.edu/q/math.stackexchange.com/questions/358423

✅ If question exists in UQ:

You'll be automatically redirected to the UQ question page with model answers and analysis.

📝 If question not found:

You can submit a request to have it considered for inclusion in our dataset.

Tip: Both short URLs (without title) and full URLs (with title) work the same way!
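
The rewrite described above can be sketched in a few lines of Python. This is an illustration based only on the example URLs shown on this page: the helper name `to_uq_mirror_url` is hypothetical, and dropping any query string or fragment is an assumption rather than documented behavior.

```python
from urllib.parse import urlsplit

# Prefix taken from the mirrored URL example on this page.
UQ_MIRROR_PREFIX = "https://uq.stanford.edu/q/"


def to_uq_mirror_url(se_url: str) -> str:
    """Map a Stack Exchange question URL to its UQ mirrored form."""
    parts = urlsplit(se_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {se_url!r}")
    # Keep the host and path as-is; query string and fragment are
    # dropped (ASSUMPTION: the mirror does not need them).
    return f"{UQ_MIRROR_PREFIX}{parts.netloc}{parts.path}"


print(to_uq_mirror_url("https://math.stackexchange.com/questions/358423"))
```

Because the path is passed through unchanged, the same function handles both the short form and the full form with the title slug.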

A proof of $\dim(R[T]) = \dim(R) + 1$ without prime ideals?

Unsolved

Background. If $R$ is a commutative ring, it is easy to prove $\dim(R[T]) \geq \dim(R) + 1$, where $\dim$ denotes the Krull dimension. If $R$ is Noetherian, we have equality. Every proof of this fac...

Category: Science / Mathematics
Tags: ring-theory, commutative-algebra, noetherian (+2 more)
Posted on: 4/11/2013
UQ ID: 256
SE votes: 632

Is there a bijection of $\mathbb{R}^n$ with itself such that the forward map is connected but the inverse is not?

Unsolved

Let $X$ and $Y$ be two topological spaces. We say that a map $f\colon \mathcal{P}(X) \to \mathcal{P}(Y)$ between their power sets is connected if for every connected $S \subseteq X$, ...

Category: Science / Mathematics
Tags: general-topology, metric-spaces, examples-counterexamples (+1 more)
Posted on: 9/30/2014
UQ ID: 257
SE votes: 185

Does every ring of integers sit inside a ring of integers that has a power basis?

Unsolved

Given a finite extension of the rationals, $K$, we know that $K = \mathbb{Q}[\alpha]$ by the primitive element theorem, so every $x \in K$ has the form ...

Category: Science / Mathematics
Tags: abstract-algebra, number-theory, ring-theory (+1 more)
Posted on: 4/22/2016
UQ ID: 258
SE votes: 142

If polynomials are almost surjective over a field, is the field algebraically closed?

Unsolved

Let $K$ be a field. Say that polynomials are almost surjective over $K$ if for any nonconstant polynomial $f(x) \in K[x]$, the image of the map $f\colon K \to K$ contains all but finitely many points of $K$....

Category: Science / Mathematics
Tags: abstract-algebra, polynomials, field-theory
Posted on: 5/19/2016
UQ ID: 259
SE votes: 135

Probability for an $n \times n$ matrix to have only real eigenvalues

Unsolved

Let $A$ be an $n \times n$ random matrix where every entry is i.i.d. and uniformly distributed on $[0,1]$. What is the probability that $A$ has only real eigenvalues? The answer cannot be $0$ or $1$, s...

Category: Science / Mathematics
Tags: linear-algebra, probability, matrices (+2 more)
Posted on: 7/27/2020
UQ ID: 260
SE votes: 122

A question about divisibility of sum of two consecutive primes

Unsolved

I was curious about the sum of two consecutive primes and after proving that the sum for the odd primes always has at least 3 prime divisors, I came up with this question:

Find the least natural numb...

Category: Science / Mathematics
Tags: number-theory, prime-numbers, prime-gaps
Posted on: 10/15/2013
UQ ID: 261
SE votes: 112

What is the largest volume of a polyhedron whose skeleton has total length 1? Is it the regular triangular prism?

Unsolved

Say that the perimeter of a polyhedron is the sum of its edge lengths. What is the maximum volume of a polyhedron with a unit perimeter? A reasonable first guess would be the regular tetrahedron of si...

Category: Science / Mathematics
Tags: geometry, optimization, volume (+2 more)
Posted on: 3/1/2021
UQ ID: 262

Can Erdős-Turán theorem be generalised that way?

Unsolved

Suppose for an arbitrary group word over the alphabet of symbols is a variety of all groups , that satisfy an identity ....

Category: Science / Mathematics
Tags: combinatorics, group-theory, finite-groups (+2 more)
Posted on: 1/11/2019
UQ ID: 263

Complete, Finitely Axiomatizable, Theory with 3 Countable Models

Unsolved

Does there exist a complete, finitely axiomatizable, first-order theory with exactly 3 countable non-isomorphic models? A few relevant comments: There is a classical example of a complete theory w...

Category: Science / Mathematics
Tags: logic, model-theory
Posted on: 8/29/2014
UQ ID: 264

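Regular way to fill a $1 \times 1$ square with $\frac{1}{n} \times \frac{1}{n+1}$ rectangles?

Unsolved

The series $\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$ suggests it might be possible to tile a $1 \times 1$ square with nonrepeated rectangles of the form $\frac{1}{n} \times \frac{1}{n+1}$. Is there a know...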

Category: Science / Mathematics
Tags: sequences-and-series, visualization, egyptian-fractions
Posted on: 2/24/2015
UQ ID: 265

News

[08/2025] Released Unsolved Questions (UQ) Paper

Contact

For questions about the project:

{niefan, kzliu, niklasm}@stanford.edu

For technical issues:

niefan@stanford.edu

Cite

If you use UQ: Assessing Language Models on Unsolved Questions, please cite:

@misc{nie2025uqassessinglanguagemodels,
  title={UQ: Assessing Language Models on Unsolved Questions}, 
  author={Fan Nie and Ken Ziyu Liu and Zihao Wang and Rui Sun and Wei Liu and Weijia Shi and Huaxiu Yao and Linjun Zhang and Andrew Y. Ng and James Zou and Sanmi Koyejo and Yejin Choi and Percy Liang and Niklas Muennighoff},
  year={2025},
  eprint={2508.17580},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.17580}
}
